Compiler Generated Multithreading to Alleviate Memory Latency

Authors

Kristof Beyls

Erik H. D'Hollander

Abstract

Since the era of vector and pipelined computing, computational speed has been limited by memory access time. Faster caches and more cache levels are used to bridge the growing gap between memory and processor speeds. With the advent of multithreaded processors, it becomes feasible to fetch data and compute concurrently in two cooperating threads. A technique is presented to generate these threads at compile time, taking into account the characteristics of both the program and the underlying architecture. The results have been evaluated for an explicitly parallel processor. For a number of common programs, the data-fetch thread allows the computation to continue without cache miss stalls.

Related resources

Compiling irregular applications for reconfigurable systems

Algorithms that exhibit irregular memory access patterns are known to show poor performance on multiprocessor architectures, particularly when memory access latency is variable. Many common data structures, including graphs, trees, and linked-lists, exhibit these irregular memory access patterns. While FPGA-based code accelerators have been successful on applications with regular memory access ...

A Multithreaded Runtime System With Thread Migration for Distributed Memory Parallel Computing

Multithreading is very effective at tolerating the latency of remote memory accesses in distributed memory parallel computers, but does nothing to reduce the number or cost of those memory accesses. Compiler techniques and runtime approaches, such as caching remote memory accesses and prefetching, are often used to reduce the number of remote memory accesses. Another approach to reduce the numb...

Latency Tolerance through Multithreading in Large-Scale Multiprocessors

In large-scale distributed-memory multiprocessors, remote memory accesses suffer significant latencies. Caches help alleviate the memory latency problem by maintaining local copies of frequently used data. However, they cannot eliminate the latency caused by first-time references and invalidations needed to enforce cache coherence. Multithreaded processors tolerate such latencies by rapidly switchi...

Compiler-Controlled Multithreading for Lenient Parallel Languages

Abstract: Tolerance to communication latency and inexpensive synchronization are critical for general-purpose computing on large multiprocessors. Fast dynamic scheduling is required for powerful non-strict parallel languages. However, machines that support rapid switching between multiple execution threads remain a design challenge. This paper explores how multithreaded execution can be address...

An Evaluation of Thread Migration for Exploiting Distributed Array Locality

Thread migration is one approach to remote memory accesses on distributed memory parallel computers. In thread migration, threads of control migrate between processors to access data local to those processors, while conventional approaches tend to move data to the threads that need them. Migration approaches enhance spatial locality by making large address spaces local, but are less adept at ex...

Journal:

J. UCS

Volume 6

Published 2000
